A PROFILING AND PERFORMANCE ANALYSIS BASED SELF-TUNING SYSTEM FOR OPTIMIZATION OF HADOOP MAPREDUCE CLUSTER CONFIGURATION

Authors

  • Dili Wu
  • Aniruddha Gokhale
  • Yi Cui
Abstract

To my family and friends, for their support and prayers over the years.

ACKNOWLEDGMENTS

First of all, I would like to thank my advisor, Dr. Aniruddha Gokhale, for his encouragement, guidance, and support during my time at Vanderbilt. He has been not only an advisor for my research but also a mentor for my life. Thank you for all the doors you have opened and all the help you have given me. I would also like to thank Dr. Yi Cui for his time and support in reviewing this thesis. Finally, I would like to thank all the members of the DOC group for discussing and sharing their opinions with me since the day I joined.

Similar resources

Performance Analysis of MapReduce Program in Heterogeneous Cloud Computing

The research of Hadoop is an important part of cloud computing industry, and Hadoop performance research is a key research direction. The Hadoop performance analysis as a basic work can provide important reference for other performance optimization researches. In this paper, based on previous researches of server performance analysis, we propose a node performance measurement method on Hadoop. ...

HiTune: Dataflow-Based Performance Analysis for Big Data Cloud

Although Big Data Cloud (e.g., MapReduce, Hadoop and Dryad) makes it easy to develop and run highly scalable applications, efficient provisioning and finetuning of these massively distributed systems remain a major challenge. In this paper, we describe a general approach to help address this challenge, based on distributed instrumentations and dataflow-driven performance analysis. Based on this...

PStorM: Profile Storage and Matching for Feedback-Based Tuning of MapReduce Jobs

The MapReduce programming model has become widely adopted for large scale analytics on big data. MapReduce systems such as Hadoop have many tuning parameters, many of which have a significant impact on performance. The map and reduce functions that make up a MapReduce job are developed using arbitrary programming constructs, which make them black-box in nature and therefore renders it difficult...

Profiling and evaluating hardware choices for MapReduce environments: An application-aware approach

The core business of many companies depends on the timely analysis of large quantities of new data. MapReduce clusters that routinely process petabytes of data represent a new entity in the evolving landscape of clouds and data centers. During the lifetime of a data center, old hardware needs to be eventually replaced by new hardware. The hardware selection process needs to be driven by perform...

Log-based Approaches to Characterizing and Diagnosing MapReduce Systems

MapReduce programs and systems are large-scale, highly distributed and parallel, consisting of many interdependent Map and Reduce tasks executing simultaneously on potentially large numbers of cluster nodes. They typically process large datasets and run for long durations. Thus, diagnosing failures in MapReduce programs is challenging due to their scale. This renders traditional time-based Serv...

Publication date: 2013